The λ-calculus is a peculiar computational model whose definition does not come with a notion of
machine. Unsurprisingly, implementations of the λ-calculus have been studied for decades. Abstract
machines are implementation schemas for fixed evaluation strategies that are a compromise between
theory and practice: they are concrete enough to provide a notion of machine and abstract enough to
avoid the many intricacies of actual implementations. There is an extensive literature about abstract
machines for the λ-calculus, and yet, quite mysteriously, the efficiency of these machines with
respect to the strategy that they implement has almost never been studied.

This paper provides an unusual introduction to abstract machines, based on the complexity of
their overhead with respect to the length of the implemented strategies. It is conceived to be a
tutorial, focusing on the case study of implementing the weak head (call-by-name) strategy, and yet
it is an original re-elaboration of known results. Moreover, some of the observations contained here
have never appeared in print before.


1 Cost Models & Size-Explosion 


The λ-calculus is an undeniably elegant computational model. Its definition is given by three constructors
and only one computational rule, and yet it is Turing-complete. A charming feature is that it does not
rest on any notion of machine or automaton. The catch, however, is that its cost models are far from being
evident. What should be taken as time and space measures for the λ-calculus? The natural answers
are the number of computational steps (for time) and the maximum size of the terms involved in a
computation (for space). Everyone who has played with the λ-calculus would immediately point out
a problem: the λ-calculus is a nondeterministic system where the number of steps depends heavily on
the evaluation strategy, so much so that some strategies may diverge when others provide a result (but
fortunately the result, if any, does not depend on the strategy). While this is certainly an issue to address,
it is not the serious one. The serious one is called size-explosion, and it affects all evaluation strategies.


Size-Explosion. There are families of terms where the size of the n-th term is linear in n, evaluation
takes a linear number of steps, but the size of the result is exponential in n. Therefore, the number of
steps does not even account for the time to write down the result, and thus at first sight it does not look
like a reasonable cost model. Let us see some examples.

The simplest one is a variation over the famous looping λ-term $\Omega := (\lambda x.xx)(\lambda x.xx) \to_\beta \Omega \to_\beta \ldots$.
In $\Omega$ there is an infinite sequence of duplications. In the first size-exploding family there is a sequence of
$n$ nested duplications. We define both the family $\{t_n\}_{n\in\mathbb{N}}$ of size-exploding terms and the family
$\{u_n\}_{n\in\mathbb{N}}$ of results of the evaluation:

$$t_0 := y \qquad\qquad\qquad u_0 := y$$
$$t_{n+1} := (\lambda x.xx)\,t_n \qquad\qquad u_{n+1} := u_n u_n$$

We use $|t|$ for the size of a term, i.e. the number of symbols to write it, and say that a term is neutral
if it is normal and it is not an abstraction.
The Complexity of Abstract Machines


Proposition 1.1 (Open and Rightmost-Innermost Size-Explosion). Let $n \in \mathbb{N}$. Then $t_n \to_\beta^n u_n$. Moreover
$|t_n| = O(n)$, $|u_n| = \Omega(2^n)$, and $u_n$ is neutral.


Proof. By induction on $n$. The base case is immediate. The inductive case: $t_{n+1} = (\lambda x.xx)t_n \to_\beta^n
(\lambda x.xx)u_n \to_\beta u_n u_n = u_{n+1}$, where the first sequence is obtained by the i.h. The bounds on the sizes
are immediate, as well as the fact that $u_{n+1}$ is neutral. □
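The statement can be checked experimentally on small instances. The following is a minimal Python sketch, not part of the original development: the term representation (`Var`, `Lam`, `App`), the naive substitution, and the helper names `t_family`, `u_family`, and `normalize` are all assumptions made for illustration. The size function counts constructors, which is one possible reading of "number of symbols". Substitution is deliberately naive, which is safe here only because the substituted terms contain no binders.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def size(t):
    """Number of constructors needed to write the term down."""
    if isinstance(t, Var):
        return 1
    if isinstance(t, Lam):
        return 1 + size(t.body)
    return 1 + size(t.fun) + size(t.arg)

def subst(t, x, s):
    """Naive t{x<-s}: no capture avoidance (assumption: s has no
    free variable that is bound inside t, as in this family)."""
    if isinstance(t, Var):
        return s if t.name == x else t
    if isinstance(t, Lam):
        return t if t.var == x else Lam(t.var, subst(t.body, x, s))
    return App(subst(t.fun, x, s), subst(t.arg, x, s))

def step(t):
    """One rightmost-innermost beta-step, or None if t is normal."""
    if isinstance(t, Var):
        return None
    if isinstance(t, Lam):
        body = step(t.body)
        return None if body is None else Lam(t.var, body)
    arg = step(t.arg)                 # look in the argument first (rightmost)
    if arg is not None:
        return App(t.fun, arg)
    fun = step(t.fun)
    if fun is not None:
        return App(fun, t.arg)
    if isinstance(t.fun, Lam):        # no inner redex left: fire the root
        return subst(t.fun.body, t.fun.var, t.arg)
    return None

DELTA = Lam("x", App(Var("x"), Var("x")))

def t_family(n):
    return Var("y") if n == 0 else App(DELTA, t_family(n - 1))

def u_family(n):
    if n == 0:
        return Var("y")
    u = u_family(n - 1)
    return App(u, u)

def normalize(t):
    """Iterate `step` and count the beta-steps."""
    steps = 0
    while True:
        nxt = step(t)
        if nxt is None:
            return t, steps
        t, steps = nxt, steps + 1

for n in range(1, 9):
    result, steps = normalize(t_family(n))
    print(n, size(t_family(n)), steps, size(result))
```

Each printed row lists $n$, the size of $t_n$, the number of steps, and the size of the result, so the linear and exponential columns can be compared directly.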


Strategy-Independent Size-Explosion. The example relies on rightmost-innermost evaluation (i.e.
the strategy that repeatedly selects the rightmost-innermost β-redex) and open terms (the free variable
$t_0 = y$). In fact, evaluating the same family in a leftmost-outermost way would produce an exponentially
long evaluation sequence. One may then believe that size-explosion is a by-product of a clumsy choice
of evaluation strategy. Unfortunately, this is not the case. It is not hard to modify the example so as to
make it strategy-independent, and it is also easy to get rid of open terms. Let the identity combinator be
$I := \lambda z.z$ (it can in fact be replaced by any closed abstraction). Define


$$r_1 := \lambda x.\lambda y.(yxx) \qquad\qquad\qquad p_0 := I$$
$$r_{n+1} := \lambda x.(r_n(\lambda y.(yxx))) \qquad\qquad p_{n+1} := \lambda y.(y\,p_n\,p_n)$$


The size-exploding family is $\{r_n I\}_{n\in\mathbb{N}}$, i.e. it is obtained by applying $r_n$ to the identity $I = p_0$. The
statement we are going to prove is in fact more general, about $r_n p_m$ instead of just $r_n I$, in order to obtain
a simple inductive proof.


Proposition 1.2 (Closed and Strategy-Independent Size-Explosion). Let $n > 0$. Then $r_n p_m \to_\beta^n p_{n+m}$, and
in particular $r_n I \to_\beta^n p_n$. Moreover, $|r_n I| = O(n)$, $|p_n| = \Omega(2^n)$, $r_n I$ is closed, and $p_n$ is normal.


Proof. By induction on $n$. The base case: $r_1 p_m = (\lambda x.\lambda y.(yxx))\,p_m \to_\beta \lambda y.(y\,p_m\,p_m) = p_{m+1}$. The inductive
case: $r_{n+1} p_m = (\lambda x.(r_n(\lambda y.(yxx))))\,p_m \to_\beta r_n(\lambda y.(y\,p_m\,p_m)) = r_n p_{m+1} \to_\beta^n p_{n+m+1}$, where the second
sequence is obtained by the i.h. The rest of the statement is immediate. □


The family $\{r_n I\}_{n\in\mathbb{N}}$ is interesting because no matter how one looks at it, it always explodes: if
evaluation is weak (i.e. it does not go under abstraction) there is only one possible derivation to normal
form and if it is strong (i.e. unrestricted) all derivations have the same length (and are permutatively
equivalent). To our knowledge this family never appeared in print before.
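As an illustration, here is the instance $n = 2$ of the family, unfolded by hand from the definitions of $r_n$ and $p_n$ (a worked example added for readability; it uses no notation beyond those definitions):

```latex
\begin{align*}
r_2\, I &= (\lambda x.(r_1(\lambda y.(yxx))))\, I \\
        &\to_\beta r_1(\lambda y.(y\,I\,I)) \;=\; r_1\, p_1 \;=\; (\lambda x.\lambda y.(yxx))\, p_1 \\
        &\to_\beta \lambda y.(y\, p_1\, p_1) \;=\; p_2
         \;=\; \lambda y.\bigl(y\,(\lambda y.(y\,I\,I))\,(\lambda y.(y\,I\,I))\bigr)
\end{align*}
```

Two steps already produce four copies of $I$; each further step doubles the number of copies, while $r_n$ grows only by a constant amount per level.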


2 The λ-Calculus is Reasonable, Indeed


Surprisingly, the isolation and the systematic study of the size-explosion problem are quite recent: there
is no trace of it in the classic books on the λ-calculus, nor in any course notes we are aware of. Its
essence, nonetheless, has been widespread folklore for a long time: in practice, functional languages
never implement full β-reduction, considered a costly operation, and theoretically the λ-calculus is usually
considered a model not suited for complexity analyses.

A way out of the issue of cost models for the λ-calculus, at first sight, is to take the time and
space required for the execution of a λ-term in a fixed implementation. There is however no canonical
implementation. The design of an implementation in fact rests on a number of choices. Consequently,
there are a number of different but more or less equivalent machines taking a different number of steps
and using different amounts of space to evaluate a term. Fixing one of them would be arbitrary, and,
most importantly, would betray the machine-independent spirit of the λ-calculus.
Micro-Step Operational Semantics. Luckily, the size-explosion problem can be solved in a
machine-independent way. Somewhat counterintuitively, in fact, the number of β-steps can be taken as a
reasonable cost model. The basic idea is simple: one has to step out of the λ-calculus, by switching to a
different setting that mimics β-reduction without literally doing it, acting on compact representations of
terms to avoid size-explosion. Essentially, the recipe requires four ingredients:


1. Statics: λ-terms are refined with a form of sharing of subterms;

2. Dynamics: evaluation has to manipulate terms with sharing via micro-operations; 
3. Cost: these micro-step operations have constant cost; 

4. Result: micro-evaluation stops on a shared representation of the result. 


The recipe also leaves some space for improvisation: the λ-calculus can in fact be enriched with first-class
sharing in various ways. Mainly, there are three approaches: abstract machines, explicit substitutions,
and graph rewriting. They differ in the details but not in the essence, and can be grouped together
under the slogan micro-step operational semantics.


Reasonable Strategies. An evaluation strategy $\to$ for the λ-calculus is reasonable if there is a
micro-step operational semantics M mimicking $\to$ and such that the number of micro-steps to evaluate a term
$t$ is polynomial in the number of $\to$-steps to evaluate $t$ (and in the size of $t$; we will come back to this
point later on). If a strategy $\to$ is reasonable then its length is a reasonable cost model, despite
size-explosion: the idea is that the λ-calculus is kept as an abstract model, easy to define and reason about,
while complexity-concerned evaluation is meant to be performed at the more sophisticated micro-step
level, where the explosion cannot happen.

Of course, the design of a reasonable micro-step operational semantics depends much on the strategy
and the chosen flavor of micro-step semantics, and it can be far from easy. For weak strategies, used
to model functional programming languages, reasonable micro-step semantics are based on a simple
form of sharing. The first result about reasonable strategies was obtained by Blelloch and Greiner in 1995
and concerns indeed a weak strategy, namely the call-by-value one. At the micro-step level it relies
on abstract machines. Similar results were then proved again, independently, by Sands, Gustavsson, and
Moran in 2002 and by Dal Lago and Martini in 2006 [12]. For strong strategies, at work in proof
assistant engines, considerably more effort and care are required. A sophisticated second level of sharing, called
useful sharing, is necessary to obtain reasonable micro-step semantics for strong evaluation. The first
such semantics was introduced by Accattoli and Dal Lago in 2014 for the leftmost-outermost
strategy, and its study is still ongoing [7, 2].


The Complexity of Abstract Machines. To sum up, various techniques, among which abstract
machines, can be used to prove that the number of β-steps is a reasonable time cost model, i.e. a metric for
time complexity. The study can then be reversed, exploring how to use this metric to study the relative
complexity of abstract machines, that is, the complexity of the overhead of the machine with respect
to the number of β-steps. Such a study leads to a new quantitative theory of abstract machines, where
machines can be compared and the value of different design choices can be measured. The rest of the
paper provides a gentle introduction to the basic concepts of the new complexity-aware theory of
abstract machines being developed by the author in joint works [3, 6] with Damiano Mazza, Pablo
Barenbaum, and Claudio Sacerdoti Coen, and resting on tools and concepts developed beforehand in
collaborations with Delia Kesner [9] and Ugo Dal Lago [8], as well as Kesner plus Eduardo Bonelli and
Carlos Lombardi [5].




Case Study: Weak Head Strategy. The paper focuses on a case study, the weak head (call-by-name)
strategy, also known as weak head reduction (we use reduction and strategy as synonyms, and prefer
strategy), and defined as follows:

$$\frac{}{(\lambda x.t)u \to_{wh} t\{x{\leftarrow}u\}} \qquad\qquad \frac{t \to_{wh} u}{t\,r \to_{wh} u\,r}\;(@l) \qquad\qquad (1)$$

This is probably the simplest possible evaluation strategy. Of course, it is deterministic. Let us mention
two other ways of defining it, as they will be useful in the sequel. First, the given inductive definition
can be unfolded into a single synthetic rule $(\lambda x.t)u\,r_1 \ldots r_k \to_{wh} t\{x{\leftarrow}u\}\,r_1 \ldots r_k$. Second, the strategy can
be given via evaluation contexts: define $E := \langle\cdot\rangle \mid E\,r$ and define $\to_{wh}$ as $E\langle(\lambda x.t)u\rangle \to_{wh} E\langle t\{x{\leftarrow}u\}\rangle$
(where $E\langle t\rangle$ is the operation of plugging $t$ in the context $E$, consisting in replacing the hole $\langle\cdot\rangle$ with $t$).

Sometimes, to stress the modularity of the reasoning, we will abstract the weak head strategy into a
generic strategy $\to$. Last, a derivation is a possibly empty sequence of rewriting steps.
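The contextual definition can be turned into a small executable sketch. The Python code below is an illustration under stated assumptions, not the paper's formalization: the term classes, the naive substitution `subst`, and the function names `weak_head_step` and `weak_head_normalize` are hypothetical. The step function walks down the left spine to find the context $E$, contracts the head redex, and plugs the reduct back. The naive substitution is adequate only for closed terms, where weak evaluation never substitutes a term with free variables.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def subst(t, x, s):
    """Naive t{x<-s}; adequate when s is closed (assumption of this sketch)."""
    if isinstance(t, Var):
        return s if t.name == x else t
    if isinstance(t, Lam):
        return t if t.var == x else Lam(t.var, subst(t.body, x, s))
    return App(subst(t.fun, x, s), subst(t.arg, x, s))

def weak_head_step(t):
    """One step E<(\\x.t)u> -> E<t{x<-u}>, or None if there is no head redex."""
    args = []                                   # arguments r_k, ..., r_1 of E
    while isinstance(t, App) and not isinstance(t.fun, Lam):
        args.append(t.arg)
        t = t.fun
    if not isinstance(t, App):                  # head is a variable or a bare abstraction
        return None
    reduct = subst(t.fun.body, t.fun.var, t.arg)
    for r in reversed(args):                    # plug the reduct back into E
        reduct = App(reduct, r)
    return reduct

def weak_head_normalize(t):
    """Iterate weak head steps and count them (may loop on divergent terms)."""
    steps = 0
    while (nxt := weak_head_step(t)) is not None:
        t, steps = nxt, steps + 1
    return t, steps

I = Lam("z", Var("z"))
K = Lam("x", Lam("y", Var("x")))
DELTA = Lam("x", App(Var("x"), Var("x")))
OMEGA = App(DELTA, DELTA)

# K I Omega terminates under weak head reduction: Omega is discarded unevaluated.
result, steps = weak_head_normalize(App(App(K, I), OMEGA))
print(result, steps)
```

The example term shows the call-by-name behavior: the diverging argument is never in head position, so it is erased without being evaluated.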


3 Introducing Abstract Machines 


Tasks of Abstract Machines. An abstract machine is an implementation schema for an evaluation
strategy $\to$ with sufficiently atomic operations and without too many details. A machine for $\to$ accounts
for 3 tasks:

1. Search: searching for $\to$-redexes;
2. Substitution: replace meta-level substitution with an approximation based on sharing;
3. Names: take care of α-equivalence.


Dissecting Abstract Machines. To guide the reader through the different concepts needed to design and
analyze abstract machines, the next two sections describe in detail two toy machines that address in
isolation the first two mentioned tasks, search and substitution. They will then be merged into the Milner
Abstract Machine (MAM). In Sect. 7 we will analyze the complexity of the MAM. Next, we will address
names and describe the Krivine Abstract Machine, and quickly study its complexity.


Abstract Machines Glossary.
• An abstract machine M is given by states, noted $s$, and transitions between them, noted $\leadsto$;

• A state is given by the code under evaluation plus some data-structures to implement search and
substitution, and to take care of names;

• The code under evaluation, as well as the other pieces of code scattered in the data-structures, are
λ-terms not considered modulo α-equivalence;

• Codes are over-lined, to stress the different treatment of α-equivalence;

• A code $\overline{t}$ is well-named if $x$ may occur only in $\overline{u}$ (if at all) for every sub-code $\lambda x.\overline{u}$ of $\overline{t}$;

• A state $s$ is initial if its code is well-named and its data-structures are empty;

• Therefore, there is a bijection $\cdot^\circ$ (up to α) between terms and initial states, called compilation,
sending a term $t$ to the initial state $t^\circ$ on a well-named code α-equivalent to $t$;

• An execution is a (potentially empty) sequence of transitions $t_0^\circ \leadsto^* s$ from an initial state $t_0^\circ$ obtained
by compiling an (initial) term $t_0$;

• A state $s$ is reachable if it can be obtained as the end state of an execution;

• A state $s$ is final if it is reachable and no transitions apply to $s$;

• A machine comes with a map $\underline{\cdot}$ from states to terms, called decoding, that on initial states is the
inverse (up to α) of compilation;

• A machine M has a set of β-transitions that are meant to be mapped to β-redexes (and whose name
involves β) by the decoding, while the remaining overhead transitions are mapped on equalities;

• We use $|\rho|$ for the length of an execution $\rho$, and $|\rho|_\beta$ for the number of β-transitions in $\rho$.


Implementations. For every machine one has to prove that it correctly implements the strategy it was
conceived for. Our notion, tuned towards complexity analyses, requires a perfect match between the
number of β-steps of the strategy and the number of β-transitions of the machine execution.


Definition 3.1 (Machine Implementation). A machine M implements a strategy $\to$ on λ-terms when, given
a λ-term $t$, the following holds:

1. Executions to Derivations: for any M-execution $\rho : t^\circ \leadsto^*_M s$ there exists a $\to$-derivation $d : t \to^* \underline{s}$.

2. Derivations to Executions: for every $\to$-derivation $d : t \to^* u$ there exists an M-execution $\rho : t^\circ \leadsto^*_M s$
such that $\underline{s} = u$.

3. β-Matching: in both previous points the number $|\rho|_\beta$ of β-transitions in $\rho$ is exactly the length $|d|$
of the derivation $d$, i.e. $|d| = |\rho|_\beta$.

Note that if a machine implements a strategy then the two are weakly bisimilar, where weakness is
given by the fact that overhead transitions do not have an equivalent on the calculus (hence their name).
Let us point out, moreover, that the β-matching requirement in our notion of implementation is unusual
but perfectly reasonable, as all abstract machines we are aware of do satisfy it.


4 The Searching Abstract Machine 


Strategies are usually specified through inductive rules such as those in (1). The inductive rules incorporate in
the definition the search for the next redex to reduce. Abstract machines make such a search explicit and
actually ensure two related subtasks:


1. Store the current evaluation context in appropriate data-structures. 
2. Search incrementally, exploiting previous searches. 


For weak head reduction the search mechanism is basic. The data structure is simply a stack $\pi$ storing
the arguments of the current head subterm.


Searching Abstract Machine. The searching abstract machine (Searching AM) in Fig. 1 has two
components, the code in evaluation position and the argument stack. The machine has only two transitions,
corresponding to the rules in (1), one β-transition ($\leadsto_\beta$) dealing with β-redexes in evaluation position
and one overhead transition ($\leadsto_{@l}$) adding a term on the argument stack. Compilation of a (well-named)
term $t$ into a machine state simply sends $t$ to the initial state $(\overline{t},\epsilon)$. The decoding given in Fig. 1 is
defined inductively on the structure of states. It can equivalently be given contextually, by associating
an evaluation context to the data structures, in our case sending the argument stack $\pi$ to a context $\underline{\pi}$ by
setting $\underline{\epsilon} := \langle\cdot\rangle$, $\underline{\overline{u} :: \pi} := \underline{\pi}\langle\langle\cdot\rangle u\rangle$, and $\underline{(\overline{t},\pi)} := \underline{\pi}\langle t\rangle$. It is useful to have both definitions since sometimes
one is more convenient than the other.




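Stacks        $\pi := \epsilon \mid \overline{t} :: \pi$
Compilation   $t^\circ := (\overline{t},\epsilon)$
Decoding      $\underline{(\overline{t},\epsilon)} := \overline{t}$
              $\underline{(\overline{t},\overline{u} :: \pi)} := \underline{(\overline{t}\,\overline{u},\pi)}$

Code                          | Stack                  | Trans.             | Code                                         | Stack
$\overline{t}\,\overline{u}$  | $\pi$                  | $\leadsto_{@l}$    | $\overline{t}$                               | $\overline{u} :: \pi$
$\lambda x.\overline{t}$      | $\overline{u} :: \pi$  | $\leadsto_\beta$   | $\overline{t}\{x{\leftarrow}\overline{u}\}$  | $\pi$

Figure 1: Searching Abstract Machine (Searching AM).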
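The machine of Fig. 1 is small enough to be sketched directly. The following Python code is a hedged illustration: the term classes, the naive `subst`, and the names `searching_am` and `decode` are assumptions of this sketch, and the stack is represented as a Python list whose top is the last element. It also counts β-transitions and overhead transitions separately, in the spirit of the β-matching requirement.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def subst(t, x, s):
    """Naive meta-level substitution; adequate for closed s (assumption)."""
    if isinstance(t, Var):
        return s if t.name == x else t
    if isinstance(t, Lam):
        return t if t.var == x else Lam(t.var, subst(t.body, x, s))
    return App(subst(t.fun, x, s), subst(t.arg, x, s))

def searching_am(t):
    """Run the Searching AM from the initial state (t, empty stack)."""
    code, stack = t, []
    beta = overhead = 0
    while True:
        if isinstance(code, App):                   # @l: push the argument
            stack.append(code.arg)
            code = code.fun
            overhead += 1
        elif isinstance(code, Lam) and stack:       # beta: redex in evaluation position
            code = subst(code.body, code.var, stack.pop())
            beta += 1
        else:                                       # final state: (\x.t, empty) or (x, stack)
            return code, stack, beta, overhead

def decode(code, stack):
    """Inductive decoding: re-apply the code to the stacked arguments."""
    for u in reversed(stack):                       # the top of the stack is applied first
        code = App(code, u)
    return code

I = Lam("z", Var("z"))
K = Lam("x", Lam("y", Var("x")))
DELTA = Lam("x", App(Var("x"), Var("x")))
OMEGA = App(DELTA, DELTA)

code, stack, beta, overhead = searching_am(App(App(K, I), OMEGA))
print(decode(code, stack), beta, overhead)
```

On this example the number of β-transitions coincides with the number of weak head steps of the term, while the overhead transitions only move arguments to the stack.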


Implementation. We now show the implementation theorem for the Searching AM with respect to the 
weak head strategy. Despite the simplicity of the machine, we provide a quite accurate account of the 
proof of the theorem, to be taken as a modular recipe. The proofs of the other implementation theorems 
in the paper will then be omitted as they follow exactly the same structure, mutatis mutandis. 

The executions-to-derivations part of the implementation theorem always rests on a lemma about the 
decoding of transitions, that in our case takes the following form. 


Lemma 4.1 (Transitions Decoding). Let $s$ be a Searching AM state.
1. β-Transition: if $s \leadsto_\beta s'$ then $\underline{s} \to_{wh} \underline{s'}$.

2. Overhead Transition: if $s \leadsto_{@l} s'$ then $\underline{s} = \underline{s'}$.


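Proof. The first point is more easily proved using the contextual definition of decoding.
1. $\underline{s} = \underline{(\lambda x.\overline{t},\overline{u} :: \pi)} = \underline{\overline{u} :: \pi}\langle\lambda x.t\rangle = \underline{\pi}\langle(\lambda x.t)u\rangle \to_{wh} \underline{\pi}\langle t\{x{\leftarrow}u\}\rangle = \underline{s'}$.
2. $\underline{s} = \underline{(\overline{t}\,\overline{u},\pi)} = \underline{(\overline{t},\overline{u} :: \pi)} = \underline{s'}$. □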


Transitions decoding extends to a projection of executions to derivations (via a straightforward
induction on the length of the execution), as required by the implementation theorem. For the
derivations-to-executions part of the theorem, we proceed similarly, by first proving that single weak head steps are
simulated by the Searching AM and then extending the simulation to derivations via an easy induction.
There is a subtlety, however, because, if done naively, one-step simulations do not compose.

Let us explain the point. Given a step $t \to_{wh} u$ there exists a state $s$ such that $t^\circ \leadsto_{@l}^* \leadsto_\beta s$ and $\underline{s} = u$,
as expected. This property, however, cannot be iterated to build a many-steps simulation, because $\underline{s} = u$
does not imply $s = u^\circ$, i.e. $s$ in general is not the compilation of $u$. To make things work, the simulation of
$t \to_{wh} u$ should not start from $t^\circ$ but from a state $s'$ such that $\underline{s'} = t$. Now, the proof of the step simulation
lemma we just described relies on the following three properties:


Lemma 4.2 (Bricks for Step Simulation).
1. Vanishing Transitions Terminate: $\leadsto_{@l}$ terminates;
2. Determinism: the Searching AM is deterministic;
3. Progress: final Searching AM states decode to $\to_{wh}$-normal terms.

Proof. Termination: $\leadsto_{@l}$-sequences are bound by the size of the code. Determinism: $\leadsto_\beta$ and $\leadsto_{@l}$
clearly do not overlap and can be applied in a unique way. Progress: final states have the form $(\lambda x.\overline{t},\epsilon)$
and $(x,\pi)$, that both decode to $\to_{wh}$-normal forms. □
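Environments  $E := \epsilon \mid [x{\leftarrow}\overline{t}] :: E$
Compilation   $t^\circ := (\overline{t},\epsilon)$
Decoding      $\underline{(\overline{t},\epsilon)} := \overline{t}$
              $\underline{(\overline{t},[x{\leftarrow}\overline{u}] :: E)} := \underline{(\overline{t}\{x{\leftarrow}\overline{u}\},E)}$

Code                                                                         | Env                                          | Trans.               | Code                                                          | Env
$(\lambda x.\overline{t})\overline{u}\,\overline{r}_1 \ldots \overline{r}_k$ | $E$                                          | $\leadsto_{d\beta}$  | $\overline{t}\,\overline{r}_1 \ldots \overline{r}_k$          | $[x{\leftarrow}\overline{u}] :: E$
$x\,\overline{r}_1 \ldots \overline{r}_k$                                    | $E' :: [x{\leftarrow}\overline{u}] :: E''$   | $\leadsto_{var}$     | $\overline{u}^\alpha\,\overline{r}_1 \ldots \overline{r}_k$   | $E' :: [x{\leftarrow}\overline{u}] :: E''$

where $\overline{u}^\alpha$ denotes $\overline{u}$ where bound names have been freshly renamed.

Figure 2: Micro-Substituting Abstract Machine (Micro AM).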


Lemma 4.3 (One-Step Simulation). Let $s$ be a Searching AM state. If $\underline{s} \to_{wh} u$ then there exists a state $s'$
such that $s \leadsto_{@l}^* \leadsto_\beta s'$ and $\underline{s'} = u$.


Proof. Let $nf_{@l}(s)$ be the normal form of $s$ with respect to $\leadsto_{@l}$, that exists and is unique by
termination of $\leadsto_{@l}$ (Lemma 4.2.1) and determinism of the machine (Lemma 4.2.2). Since $\leadsto_{@l}$ is mapped
on identities (Lemma 4.1.2) one has $\underline{nf_{@l}(s)} = \underline{s}$. By hypothesis $\underline{s}$ $\to_{wh}$-reduces, so that by progress
(Lemma 4.2.3) $nf_{@l}(s)$ cannot be final. Then $nf_{@l}(s) \leadsto_\beta s'$, and $\underline{nf_{@l}(s)} = \underline{s} \to_{wh} \underline{s'}$ by the one-step
simulation lemma (Lemma 4.1.1). By determinism of $\to_{wh}$, one obtains $\underline{s'} = u$. □


Finally, we obtain the implementation theorem. 


Theorem 4.4. The Searching AM implements the weak head strategy. 


Proof. Executions to Derivations: by induction on the length $|\rho|$ of $\rho$ using Lemma 4.1. Derivations to
Executions: by induction on the length $|d|$ of $d$ using Lemma 4.3 and noting that $\underline{t^\circ} = t$. □


5 The Micro-Substituting Abstract Machine 


Decomposing Meta-Level Substitution. The second task of abstract machines is to replace meta-level
substitution $\{x{\leftarrow}\overline{u}\}$ with micro-step substitution on demand, i.e. a parsimonious approximation of
meta-level substitution based on:

1. Sharing: when a β-redex $(\lambda x.\overline{t})\overline{u}$ is in evaluation position it is fired but the meta-level substitution
$\overline{t}\{x{\leftarrow}\overline{u}\}$ is delayed, by introducing an annotation $[x{\leftarrow}\overline{u}]$ in a data-structure for delayed substitutions
called environment;

2. Micro-Step Substitution: variable occurrences are replaced one at a time;

3. Substitution on Demand: replacement of a variable occurrence happens only when it ends up in
evaluation position; variable occurrences that do not end in evaluation position are never substituted.

The purpose of this section is to illustrate this process in isolation via the study of a toy machine, the
Micro-Substituting Abstract Machine (Micro AM) in Fig. 2, forgetting about the search for redexes.




Environments. We are going to treat environments in an unusual way: the literature mostly deals
with local environments, to be discussed in Sect. 9, while here we prefer to first address the simpler
notion of global environment, but to ease the terminology we will simply call them environments. So,
an environment $E$ is a list of entries of the form $[x{\leftarrow}\overline{u}]$. Each entry denotes the delayed substitution of
$\overline{u}$ for $x$. In a state $(\overline{t}, E' :: [x{\leftarrow}\overline{u}] :: E'')$ the scope of $x$ is given by $\overline{t}$ and $E'$, as stated by the forthcoming
Lemma 5.1. The (global) environment models a store. As is standard in the literature, it is a list, but
the list structure is only used to obtain a simple decoding and a handy delimitation of the scope of its
entries. These properties are useful to develop the meta-theory of abstract machines, but keep in mind
that (global) environments are not meant to be implemented as lists.


Code. The code under evaluation is now a λ-term $\overline{h}\,\overline{r}_1 \ldots \overline{r}_k$ expressed as a head $\overline{h}$ (that is either a
β-redex $(\lambda x.\overline{t})\overline{u}$ or a variable $x$) applied to $k$ arguments. This is a by-product of the fact that the Micro AM
does not address search.


Transitions. There are two transitions:
• Delaying β: transition $\leadsto_{d\beta}$ removes the β-redex $(\lambda x.\overline{t})\overline{u}$ but does not execute the expected
substitution $\{x{\leftarrow}\overline{u}\}$; it rather delays it, adding $[x{\leftarrow}\overline{u}]$ to the environment. It is the β-transition of the
Micro AM.

• Micro-Substitution On Demand: if the head of the code is a variable $x$ and there is an entry $[x{\leftarrow}\overline{u}]$
in the environment then transition $\leadsto_{var}$ replaces that occurrence of $x$, and only that occurrence,
with a copy of $\overline{u}$. It is necessary to rename the new copy of $\overline{u}$ (into a well-named term) to avoid
name clashes. It is the overhead transition of the Micro AM.


Implementation. Compilation sends a (well-named) term $t$ to the initial state $(\overline{t},\epsilon)$, as for the
Searching AM (but now the empty data-structure is the environment). The decoding simply applies the delayed
substitutions in the environment to the term, considering them as meta-level substitutions.

The implementation of weak head reduction $\to_{wh}$ by the Micro AM can be shown using the recipe
given for the Searching AM, and it is therefore omitted. The only difference is in the proof that the
overhead transition $\leadsto_{var}$ terminates, that is based on a different argument. We spell it out because it will
also be useful later on for complexity analyses. It requires the following invariant of machine executions:


Lemma 5.1 (Name Invariant). Let $s = (\overline{t},E)$ be a Micro AM reachable state.
1. Abstractions: if $\lambda x.\overline{u}$ is a subterm of $\overline{t}$ or of any code in $E$ then $x$ may occur only in $\overline{u}$;
2. Environment: if $E = E' :: [x{\leftarrow}\overline{u}] :: E''$ then $x$ is fresh with respect to $\overline{u}$ and $E''$.

Proof. By induction on the length of the execution $\rho$ leading to $s$. If $\rho$ is empty then $s$ is initial and the
statement holds because $\overline{t}$ is well-named by hypothesis. If $\rho$ is non-empty then it follows from the i.h.
and the fact that transitions preserve the invariant, as an immediate inspection shows. □


Lemma 5.2 (Micro-Substitution Terminates). $\leadsto_{var}$ terminates in at most $|E|$ steps (on reachable states).


Proof. Consider a $\leadsto_{var}$ transition copying $\overline{u}$ from the environment $E' :: [x{\leftarrow}\overline{u}] :: E''$. If the next transition
is again $\leadsto_{var}$, then the head of $\overline{u}$ is a variable $y$ and the transition copies from an entry in $E''$ because by
Lemma 5.1 $y$ cannot be bound by the entries in $E'$. Then the number of consecutive $\leadsto_{var}$ transitions is
bound by $|E|$ (that is not extended by $\leadsto_{var}$). □


Theorem 5.3. The Micro AM implements the weak head strategy. 
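Environments  $E := \epsilon \mid [x{\leftarrow}\overline{t}] :: E$
Stacks        $\pi := \epsilon \mid \overline{t} :: \pi$
Compilation   $t^\circ := (\overline{t},\epsilon,\epsilon)$
Decoding      $\underline{(\overline{t},\epsilon,\epsilon)} := \overline{t}$
              $\underline{(\overline{t},\overline{u} :: \pi,E)} := \underline{(\overline{t}\,\overline{u},\pi,E)}$
              $\underline{(\overline{t},\epsilon,[x{\leftarrow}\overline{u}] :: E)} := \underline{(\overline{t}\{x{\leftarrow}\overline{u}\},\epsilon,E)}$

Code                          | Stack                  | Env                                         | Trans.             | Code                     | Stack                  | Env
$\overline{t}\,\overline{u}$  | $\pi$                  | $E$                                         | $\leadsto_{@l}$    | $\overline{t}$           | $\overline{u} :: \pi$  | $E$
$\lambda x.\overline{t}$      | $\overline{u} :: \pi$  | $E$                                         | $\leadsto_\beta$   | $\overline{t}$           | $\pi$                  | $[x{\leftarrow}\overline{u}] :: E$
$x$                           | $\pi$                  | $E' :: [x{\leftarrow}\overline{t}] :: E''$  | $\leadsto_{var}$   | $\overline{t}^\alpha$    | $\pi$                  | $E' :: [x{\leftarrow}\overline{t}] :: E''$

where $\overline{t}^\alpha$ denotes $\overline{t}$ where bound names have been freshly renamed.

Figure 3: Milner Abstract Machine (MAM).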


6 Search + Micro-Substitution = Milner Abstract Machine 


The Searching AM and the Micro AM can be merged together into the Milner Abstract Machine (MAM),
defined in Fig. 3. The MAM has both an argument stack and an environment. The machine has one
β-transition $\leadsto_\beta$ inherited from the Searching AM, and two overhead transitions, $\leadsto_{@l}$ inherited from
the Searching AM and $\leadsto_{var}$ inherited from the Micro AM. Note that in $\leadsto_{var}$ the code now is simply a
variable, because the arguments are supposed to be stored in the argument stack.

For the implementation theorem once again the only delicate point is to prove that the overhead
transitions terminate. As for the Micro AM one needs a name invariant. A termination measure can
then be defined easily by mixing the size of the codes (needed for $\leadsto_{@l}$) and the size of the environment
(needed for $\leadsto_{var}$), and it is omitted here, because it will be exhaustively studied for the complexity
analysis of the MAM. Therefore, we obtain that:


Theorem 6.1. The MAM implements the weak head strategy. 
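To make the three transitions concrete, here is a minimal Python sketch of the MAM. Everything in it is an assumption for illustration: the term classes, the `#`-suffixed fresh names produced by `rename`, the use of an insertion-ordered `dict` for the global environment (names are unique by the name invariant), and the helper `r_family` building the closed size-exploding family of Sect. 1. Compilation is realized by renaming every binder of the input, which yields a well-named code provided that input names do not contain `#`.

```python
from dataclasses import dataclass
from itertools import count

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def size(t):
    if isinstance(t, Var):
        return 1
    if isinstance(t, Lam):
        return 1 + size(t.body)
    return 1 + size(t.fun) + size(t.arg)

def subst(t, x, s):
    """Naive substitution, used only by decoding (binders are all distinct)."""
    if isinstance(t, Var):
        return s if t.name == x else t
    if isinstance(t, Lam):
        return t if t.var == x else Lam(t.var, subst(t.body, x, s))
    return App(subst(t.fun, x, s), subst(t.arg, x, s))

_fresh = count()

def rename(t, ren):
    """Copy t giving every binder a fresh name (the renaming t^alpha)."""
    if isinstance(t, Var):
        return Var(ren.get(t.name, t.name))
    if isinstance(t, Lam):
        fresh = f"{t.var}#{next(_fresh)}"
        return Lam(fresh, rename(t.body, {**ren, t.var: fresh}))
    return App(rename(t.fun, ren), rename(t.arg, ren))

def mam(t):
    """Run the MAM from the compilation of t; may loop on divergent terms."""
    code, stack, env = rename(t, {}), [], {}
    counts = {"beta": 0, "@l": 0, "var": 0}
    while True:
        if isinstance(code, App):                       # @l: push the argument
            stack.append(code.arg)
            code = code.fun
            counts["@l"] += 1
        elif isinstance(code, Lam) and stack:           # beta: delay the substitution
            env[code.var] = stack.pop()
            code = code.body
            counts["beta"] += 1
        elif isinstance(code, Var) and code.name in env:  # var: copy one entry, renamed
            code = rename(env[code.name], {})
            counts["var"] += 1
        else:
            return code, stack, env, counts

def decode(code, stack, env):
    """Re-apply the stack, then apply delayed substitutions, newest entry first."""
    for u in reversed(stack):
        code = App(code, u)
    for x, u in reversed(list(env.items())):
        code = subst(code, x, u)
    return code

I = Lam("z", Var("z"))
DELTA = Lam("x", App(Var("x"), Var("x")))
YXX = App(App(Var("y"), Var("x")), Var("x"))

def r_family(n):
    """r_1 = \\x.\\y.(yxx) and r_{n+1} = \\x.(r_n (\\y.(yxx)))."""
    body = Lam("y", YXX) if n == 1 else App(r_family(n - 1), Lam("y", YXX))
    return Lam("x", body)

code, stack, env, counts = mam(App(r_family(10), I))
state_size = size(code) + sum(map(size, stack)) + sum(map(size, env.values()))
print(counts, state_size, size(decode(code, stack, env)))
```

On the family $r_n I$ the final state stays small while its decoding is exponentially larger, which is the sharing effect described by the recipe of Sect. 2. Running `mam(App(DELTA, I))` instead exercises the micro-substitution transition.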


7 Introducing Complexity Analyses 


The complexity analysis of abstract machines is the study of the asymptotic behavior of their overhead. 


Parameters for Complexity Analyses. Let us reason abstractly, by considering a generic strategy $\to$
in the λ-calculus and a given machine M implementing $\to$. By the derivations-to-executions part of the
implementation (Definition 3.1), given a derivation $d : t_0 \to^n u$ there is a shortest execution $\rho : t_0^\circ \leadsto^*_M s$
such that $\underline{s} = u$. Determining the complexity of M amounts to bounding the complexity of a concrete
implementation of $\rho$, say on a RAM model, as a function of two fundamental parameters:


1. Input: the size $|t_0|$ of the initial term $t_0$ of the derivation $d$;


2. Strategy: the length $n = |d|$ of the derivation $d$, that coincides with the number $|\rho|_\beta$ of β-transitions
in $\rho$ by the β-matching requirement for implementations.


Note that our notion of implementation allows us to forget about the strategy while studying the complexity
of the machine, because the two fundamental parameters are internalized: the input is simply the initial
code and the length of the strategy is simply the number of β-transitions.




Types of Machines. The bound on the overhead of the machine is then used to classify it, as follows. 


Definition 7.1. Let M be an abstract machine implementing a strategy $\to$. Then


• M is reasonable if the complexity of M is polynomial in the input $|t_0|$ and the strategy $|\rho|_\beta$;


• M is unreasonable if it is not reasonable;


• M is efficient if it is linear in both the input and the strategy (we sometimes say that it is bilinear).


Recipe for Complexity Analyses. The estimation of the complexity of a machine usually takes 3 steps: 


1. Number of Transitions: bound the length of the execution p simulating the derivation d, usually 
having a bound on every kind of transition of M. 


2. Cost of Single Transitions: bound the cost of concretely implementing a single transition of M;
different kinds of transitions usually have different costs. Here it is usually necessary to go beyond
the abstract level, making some (high-level) assumptions on how codes and data-structures are
concretely represented (our case study will provide examples).


3. Complexity of the Overhead: obtain the total bound by composing the first two points, that is, by 
taking the number of each kind of transition times the cost of implementing it, and summing over 
all kinds of transitions. 
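The third step of the recipe can be summarized by a single formula. The notation is an assumption introduced here for illustration: $a$ ranges over the kinds of transitions of M, $|\rho|_a$ is the number of transitions of kind $a$ in $\rho$, and $c_a$ is an upper bound on the cost of implementing one transition of kind $a$.

```latex
\[
  \mathrm{cost}(\rho) \;\le\; \sum_{a} |\rho|_a \cdot c_a
\]
```

For a machine with one β-transition and two overhead transitions, such as the MAM, the sum has three summands, one per kind of transition.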


8 The Complexity of the MAM 


In this section we provide the complexity analysis of the MAM, from which analyses of the Searching 
and Micro AM easily follow. 


The Crucial Subterm Invariant. The analysis is based on the following subterm invariant. 


Lemma 8.1 (Subterm Invariant). Let $\rho : t_0^\circ \leadsto^*_{MAM} (\overline{u},\pi,E)$ be a MAM execution. Then $\overline{u}$ and any code
in $\pi$ and $E$ are subterms of $t_0$.


Note that the MAM copies code only in transition $\leadsto_{var}$, where it copies a code from the environment
$E$. Therefore, the subterm invariant bounds the size of the subterms duplicated along the execution.

Let us be precise about subterms: for us, $\overline{u}$ is a subterm of $t_0$ if it is so up to variable names, both
free and bound (and so the distinction between terms and codes is irrelevant). More precisely: define
$t^-$ as $t$ in which all variables (including those appearing in binders) are replaced by a fixed symbol $*$.
Then, we will consider $u$ to be a subterm of $t$ whenever $u^-$ is a subterm of $t^-$ in the usual sense. The key
property ensured by this definition is that the size $|\overline{u}|$ of $\overline{u}$ is bounded by $|t|$.


Proof. By induction on the length of p. The base case is immediate and the inductive one follows from 
the i.h. and the immediate fact that the transitions preserve the invariant. O 


The subterm invariant is crucial, for two related reasons. First, it linearly relates the cost of
duplications to the size of the input, enabling complexity analyses. With respect to the length of the strategy,
then, micro-step operations have constant cost, as required by the recipe for micro-step operational
semantics in Sect. 2. Second, it implies that size-explosion has been circumvented: duplications are linear,
and so the size of the state can grow at most linearly with the number of steps, i.e. it cannot explode. In
particular, we also obtain the compact representation of the results required by the recipe.

The relevance of the subterm invariant goes in fact well beyond abstract machines, as it is typical
of most instances of micro-step operational semantics. And for complexity analyses of the λ-calculus
it is absolutely essential, playing a role analogous to that of the cut-elimination theorem in the study of
sequent calculi or of the sub-formula property for proof search.


Number of Transitions. The next lemma bounds the global number of overhead transitions. For the
micro-substituting transition $\leadsto_{var}$ it relies on an auxiliary bound of a more local form. For the searching
transition $\leadsto_{@l}$ the bound relies on the subterm invariant. We denote with $|\rho|_\beta$, $|\rho|_{@l}$, and $|\rho|_{var}$ the
number of $\leadsto_\beta$, $\leadsto_{@l}$, and $\leadsto_{var}$ transitions in $\rho$, respectively.


Lemma 8.2. Let $\rho : t_0^\circ \leadsto^*_{MAM} s$ be a MAM execution. Then:

1. Micro-Substitution Linear Local Bound: if $\sigma : s \leadsto^*_{@l,var} s'$ then $|\sigma|_{var} \le |E| = |\rho|_\beta$;

2. Micro-Substitution Quadratic Global Bound: $|\rho|_{var} \le |\rho|_\beta^2$;

3. Searching (and β) Local Bound: if $\sigma : s \leadsto^*_{\beta,@l} s'$ then $|\sigma| \le |t_0|$;

4. Searching Global Bound: $|\rho|_{@l} \le |t_0| \cdot (|\rho|_{var} + 1) \le |t_0| \cdot (|\rho|_\beta^2 + 1)$.


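Proof. 

1. Reasoning along the lines of Lemma 5.2 one obtains that $\rightsquigarrow_{var}$ transitions in 
$\sigma$ have to use entries of $E$ from left to right ($\rightsquigarrow_{@l}$ and 
$\rightsquigarrow_{var}$ do not modify $E$), and so $|\sigma|_{var} \le |E|$. Now, $|E|$ is exactly 
$|\rho|_\beta$, because the only transition extending $E$, and of exactly one entry, is 
$\rightsquigarrow_\beta$. 

2. The fact that a linear local bound induces a quadratic global bound is a standard reasoning. We 
spell it out to help the unacquainted reader. The execution $\rho$ alternates phases of 
$\beta$-transitions and phases of overhead transitions, i.e. it has the shape: 

\[
t_0^\circ = s_1 \rightsquigarrow^*_\beta s'_1 \rightsquigarrow^*_{@l,var} s_2 \rightsquigarrow^*_\beta s'_2 
\rightsquigarrow^*_{@l,var} \cdots\ s_k \rightsquigarrow^*_\beta s'_k \rightsquigarrow^*_{@l,var} s
\]

Let $a_i$ be the length of the segment $s_i \rightsquigarrow^*_\beta s'_i$ and $b_i$ be the number of 
$\rightsquigarrow_{var}$ transitions in the segment $s'_i \rightsquigarrow^*_{@l,var} s_{i+1}$, for 
$i = 1, \ldots, k$. By Point 1 we obtain $b_i \le \sum_{j=1}^{i} a_j$. Then 
$|\rho|_{var} = \sum_{i=1}^{k} b_i \le \sum_{i=1}^{k} \sum_{j=1}^{i} a_j$. Note that 
$\sum_{j=1}^{i} a_j \le \sum_{j=1}^{k} a_j = |\rho|_\beta$ and $k \le |\rho|_\beta$. So 
$|\rho|_{var} \le \sum_{i=1}^{k} \sum_{j=1}^{i} a_j \le \sum_{i=1}^{k} |\rho|_\beta \le |\rho|_\beta^2$. 

3. The length of $\sigma$ is bound by the size of the code in the state $s$ because 
$\rightsquigarrow_{\beta,@l}$ strictly decreases the size of the code, that in turn is bound by the size 
$|t_0|$ of the initial term by the subterm invariant (Lemma 8.1). 

4. The execution $\rho$ alternates phases of $\rightsquigarrow_\beta$ and $\rightsquigarrow_{@l}$ 
transitions and phases of $\rightsquigarrow_{var}$ transitions, i.e. it has the shape: 

\[
t_0^\circ = s_1 \rightsquigarrow^*_{\beta,@l} s'_1 \rightsquigarrow^*_{var} s_2 \rightsquigarrow^*_{\beta,@l} s'_2 
\rightsquigarrow^*_{var} \cdots\ s_k \rightsquigarrow^*_{\beta,@l} s'_k \rightsquigarrow^*_{var} s
\]

By Point 3 the length of the segments $s_i \rightsquigarrow^*_{\beta,@l} s'_i$ is bound by the size 
$|t_0|$ of the initial term. The code may grow, instead, with $\rightsquigarrow_{var}$ transitions. So 
$|\rho|_{@l}$ is bound by $|t_0|$ times the number $|\rho|_{var}$ of micro-substitution transitions, plus 
$|t_0|$ once more, because at the beginning there might be $\rightsquigarrow_{\beta,@l}$ transitions 
before any $\rightsquigarrow_{var}$ transition; in symbols, $|\rho|_{@l} \le |t_0| \cdot (|\rho|_{var} + 1)$. 
Finally, $|t_0| \cdot (|\rho|_{var} + 1) \le |t_0| \cdot (|\rho|_\beta^2 + 1)$ by Point 2. $\square$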
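
As a sanity check on the counting argument of Point 2, consider the hypothetical case (an assumption made here only for illustration) in which every $\beta$-phase consists of exactly one transition, i.e. $a_i = 1$ for all $i$, so that $k = |\rho|_\beta$. The local bound of Point 1 then reads $b_i \le i$, and summing it gives:

```latex
\[
|\rho|_{var} \;=\; \sum_{i=1}^{k} b_i \;\le\; \sum_{i=1}^{k} i \;=\; \frac{k(k+1)}{2} \;\le\; k^2 \;=\; |\rho|_\beta^2 .
\]
```

The intermediate sum is still of order $k^2$, so summing the linear local bound phase by phase cannot yield better than a quadratic global bound by this argument alone.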




Cost of Single Transitions. To estimate the cost of concretely implementing single transitions we need 
to make some hypotheses on how the MAM is going to be itself implemented on RAM: 


1. Codes, Variable (Occurrences), and Environment Entries: abstractions and applications are 
constructors with pointers to subterms, a variable is a memory location, a variable occurrence is a 
reference to that location, and an environment entry $[x \leftarrow t]$ is the fact that the location 
associated to $x$ contains (the topmost constructor of) $t$. 


2. Random Access to Global Environments: the environment $E$ of the MAM can be accessed in 
constant time (in $\rightsquigarrow_{var}$) by just following the reference given by the variable 
occurrence under evaluation, with no need to access $E$ sequentially, thus ignoring its list structure. 
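
The two hypotheses can be pictured with a minimal Python sketch. The class names `Var`, `Occ`, `Lam`, `App`, the field `content`, and the helpers `add_entry` and `lookup` are illustrative assumptions, not notation from the paper; Python object references stand in for RAM pointers.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(eq=False)
class Var:
    """A variable is a memory location; `content` holds its environment entry."""
    name: str
    content: Optional["Code"] = None  # None means: no entry [x<-t] yet


@dataclass(eq=False)
class Occ:
    """A variable occurrence is a reference to the variable's location."""
    var: Var


@dataclass(eq=False)
class Lam:
    """An abstraction: a constructor pointing to its variable and its body."""
    var: Var
    body: "Code"


@dataclass(eq=False)
class App:
    """An application: a constructor pointing to its two subterms."""
    left: "Code"
    right: "Code"


Code = Union[Occ, Lam, App]


def add_entry(x: Var, t: Code) -> None:
    """Create the entry [x<-t]: store a pointer to t in the location of x."""
    x.content = t


def lookup(occ: Occ) -> Optional[Code]:
    """Constant-time access: follow the reference, never scan a list."""
    return occ.var.content
```

With this layout the list structure of the environment plays no role at lookup time: every occurrence of the same variable reaches the same entry in one pointer dereference.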


It is now possible to bound the cost of single transitions. Note that the case of 
$\rightsquigarrow_{var}$ transitions relies on the subterm invariant. 


Lemma 8.3. Let $\rho : t_0^\circ \rightsquigarrow^*_{MAM} s$ be a MAM execution. Then: 

1. Each $\rightsquigarrow_{@l}$ transition in $\rho$ has constant cost; 

2. Each $\rightsquigarrow_\beta$ transition in $\rho$ has constant cost; 

3. Each $\rightsquigarrow_{var}$ transition in $\rho$ has cost bounded by the size $|t_0|$ of the initial term. 


Proof. According to our hypothesis on the concrete implementation of the MAM, $\rightsquigarrow_{@l}$ 
just moves the pointer to the current code on the left subterm of the application and pushes the pointer 
to the right subterm on the stack, which evidently takes constant time. Similarly for 
$\rightsquigarrow_\beta$. For $\rightsquigarrow_{var}$, the environment entry $[x \leftarrow t]$ is 
accessed in constant time by hypothesis, but $t$ has to be $\alpha$-renamed, i.e. copied. It is not hard 
to see that this can be done in time linear in $|t|$ (the naive algorithm for copying carries around a 
list of variables, and it is quadratic, but it can be easily improved to be linear), which by the subterm 
invariant (Lemma 8.1) is bound by the size $|t_0|$ of the initial term. $\square$
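
One way to avoid the quadratic list scan during copying is sketched below in Python. This is an illustration under stated assumptions, not the paper's algorithm: terms are encoded as tuples `("var", x)`, `("lam", x, body)`, `("app", left, right)`, fresh names are built by appending a counter (so source names are assumed not to already end in `_<number>`), and a dictionary replaces the list of variables, which makes each lookup expected constant time.

```python
import itertools

_fresh = itertools.count()  # global supply of fresh name suffixes


def rename(term, renaming=None):
    """Copy `term`, giving every bound variable a fresh name.

    `renaming` maps old bound names to their fresh replacements. Each node
    is visited once and each dict operation is expected O(1), so the copy
    is expected linear in the size of the term.
    """
    if renaming is None:
        renaming = {}
    tag = term[0]
    if tag == "var":
        # Free variables are not in the map and are left untouched.
        return ("var", renaming.get(term[1], term[1]))
    if tag == "app":
        return ("app", rename(term[1], renaming), rename(term[2], renaming))
    # tag == "lam": bind a fresh name, copy the body, then restore the map.
    old = term[1]
    new = f"{old}_{next(_fresh)}"
    shadowed = renaming.get(old)
    renaming[old] = new
    body = rename(term[2], renaming)
    if shadowed is None:
        del renaming[old]
    else:
        renaming[old] = shadowed
    return ("lam", new, body)
```

The recursion depth follows the depth of the term, so very deep terms would need an explicit stack. Under the pointer representation assumed for the MAM, the same effect can be had without hashing by storing the fresh copy of a variable in the variable's own location.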


Complexity of the Overhead. By composing the analysis of the number of transitions (Lemma 8.2) 
with the analysis of the cost of single transitions (Lemma 8.3) we obtain the main result of the paper. 

Theorem 8.4 (The MAM is Reasonable). Let $\rho : t_0^\circ \rightsquigarrow^*_{MAM} s$ be a MAM 
execution. Then: 

1. $\rightsquigarrow_{@l}$ transitions in $\rho$ cost all together $O(|t_0| \cdot (|\rho|_\beta^2 + 1))$; 

2. $\rightsquigarrow_\beta$ transitions in $\rho$ cost all together $O(|\rho|_\beta)$; 

3. $\rightsquigarrow_{var}$ transitions in $\rho$ cost all together $O(|t_0| \cdot (|\rho|_\beta^2 + 1))$; 

Then $\rho$ can be implemented on RAM with cost $O(|t_0| \cdot (|\rho|_\beta^2 + 1))$, i.e. the MAM is a 
reasonable implementation of the weak head strategy. 
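
The composition behind Theorem 8.4 can be spelled out as one chain of inequalities. This is a sketch that only combines the bounds of Lemma 8.2 and Lemma 8.3; the constant $c$ is an assumption standing for the unspecified constant hidden in the cost of a single transition, and the last step uses $|t_0| \ge 1$ and $|\rho|_\beta \le |\rho|_\beta^2$.

```latex
\begin{align*}
\mathrm{cost}(\rho)
  &\le c \cdot |\rho|_{@l} \;+\; c \cdot |\rho|_\beta \;+\; c \cdot |t_0| \cdot |\rho|_{var} \\
  &\le c \cdot |t_0| \cdot (|\rho|_\beta^2 + 1) \;+\; c \cdot |\rho|_\beta \;+\; c \cdot |t_0| \cdot |\rho|_\beta^2 \\
  &\le 3c \cdot |t_0| \cdot (|\rho|_\beta^2 + 1)
  \;=\; O\bigl(|t_0| \cdot (|\rho|_\beta^2 + 1)\bigr).
\end{align*}
```

The dominating terms are the search and micro-substitution ones; the $\beta$-transitions themselves contribute only a linear term.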


The Efficient MAM. According to the terminology of Sect. 3 the MAM is reasonable but it is not 
efficient because micro-substitution takes time quadratic in the length of the strategy. The quadratic 
factor comes from the fact that in the environment there can be growing chains of renamings, i.e. of 
substitutions of variables for variables, see [6] for more details on this issue. The MAM can actually 
be optimized easily, obtaining an efficient implementation, by replacing $\rightsquigarrow_\beta$ with the 
following two $\beta$-transitions: 

[Transition rules not recoverable; the one surviving side condition requires that the argument is not a variable.] 

Search is Linear and the Micro AM is Reasonable. By Lemma 8.2 the cost of search in the MAM 
is linear in the number of transitions for implementing micro-substitution. This is an instance of a more 
general fact: search turns out to always be bilinear (in the initial code and in the amount of 
micro-substitutions). There are two consequences of this general fact. First, it can be turned into a 
design principle for abstract machines: search has to be bilinear, otherwise there is something wrong in 
the design of the machine. Second, search is somewhat negligible for complexity analyses. 

The Micro AM can be seen as the MAM up to search. In particular, it satisfies a subterm invariant 
and thus circumvents size-explosion. The Micro AM is however considerably less efficient, because at each 
step it has to search the redex from scratch. An easy but omitted analysis shows that its overhead is 
nonetheless polynomial. Therefore, it makes sense to consider very abstract machines such as the Micro AM 
that omit search. In fact, they already exist, in disguise, as strategies in the linear substitution calculus 
(1) [5], a recent approach to explicit substitutions modeling exactly micro-substitution without search (the 
traditional approach to explicit substitutions instead models both micro-substitution and search) and they 
were used for the first proof that a strong strategy (the leftmost-outermost one) is reasonable [10]. 


The Searching AM is Unreasonable. It is not hard to see that the Searching AM is unreasonable. 
Actually, the number of transitions is reasonable. The projection of MAM executions on Searching AM 
executions, indeed, shows that the number of searching transitions of the Searching AM is reasonable. 
It is the cost of single transitions that becomes unreasonable. In fact, the Searching AM does not have a 
subterm invariant, because it rests on meta-level substitution, and the size of the terms duplicated by the 
$\rightsquigarrow_\beta$ transition can explode (it is enough to consider the size-exploding family of Proposition 1.2). 

The moral is that micro-substitution is more fundamental than search. While the cost of search can 
be expressed in terms of the cost of micro-substitution, the converse is in fact not possible. 


9 Names: Krivine Abstract Machine 


Accounting for Names. In the study presented so far we repeatedly took names seriously, by 
distinguishing between terms and codes, by asking that initial codes are well-named, and by proving an 
invariant about names (Lemma 5.1). The process of $\alpha$-renaming however has not been made explicit; 
the machines we presented rather rely on a meta-level renaming, used as a black box. 

The majority of the literature on abstract machines, instead, pays more attention to $\alpha$-equivalence, 
or rather to how to avoid it. We distinguish two levels: 

1. Removal of on-the-fly $\alpha$-equivalence: in these cases the machine works on terms with variable 
names but it is designed in order to implement evaluation without ever $\alpha$-renaming. Technically, 
the global environment of the MAM is replaced by many local environments, each one for every 
piece of code in the machine. The machine becomes more complex, in particular the non-trivial 
concept of closure (to be introduced shortly) is necessary. 


2. Removal of names: terms are represented using de Bruijn indexes (or de Bruijn levels), removing 
the problem of $\alpha$-equivalence altogether but sacrificing the readability of the machine and reducing 
its abstract character. Usually this level is built on top of the previous one. 
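
To make the second level concrete, here is a minimal Python sketch of the translation from named terms to de Bruijn indexes. The tuple encoding of terms and the tags `"idx"` and `"free"` are assumptions made for this illustration; index 0 refers to the innermost enclosing binder.

```python
def to_de_bruijn(term, bound=()):
    """Convert a named term to de Bruijn indexes.

    Named terms are tuples: ("var", x), ("lam", x, body), ("app", l, r).
    `bound` lists the binders in scope, innermost first. The `index` scan
    is linear in the binder depth, so this simple version is not
    linear-time overall.
    """
    tag = term[0]
    if tag == "var":
        name = term[1]
        if name in bound:
            # Position of the nearest enclosing binder with this name.
            return ("idx", bound.index(name))
        return ("free", name)
    if tag == "lam":
        # The binder's name disappears; only the body is kept.
        return ("lam", to_de_bruijn(term[2], (term[1],) + bound))
    return ("app", to_de_bruijn(term[1], bound), to_de_bruijn(term[2], bound))
```

Two $\alpha$-equivalent named terms are mapped to the same nameless term, which is why the problem of $\alpha$-equivalence disappears at this level.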

We are now going to introduce the Krivine Abstract Machine (keeping names, so at the first level), yet 
another implementation of the weak head strategy. Essentially, it is a version of the MAM without 
on-the-fly $\alpha$-equivalence. The complexity analysis will show that it has exactly the same complexity 
as the MAM. The further removal of names is only (anti)cosmetic: the complexity is not affected either. 
Consequently, the task of accounting for names is, as for search, negligible for complexity analyses. 




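Closures          $c := (\bar t, e)$ 
Local Env.        $e := \epsilon \mid [x \leftarrow c] :: e$ 
Stacks            $\pi := \epsilon \mid c :: \pi$ 
States            $s := (c, \pi)$ 
Compilation       $t^\circ := ((\bar t, \epsilon), \epsilon)$ 
Closure Decoding  $\underline{(\bar t, \epsilon)} := t$ 
                  $\underline{(\bar t, [x \leftarrow c] :: e)} := \underline{(\bar t\{x \leftarrow \underline{c}\}, e)}$ 
State Decoding    $\underline{(c, \epsilon)} := \underline{c}$ 
                  $\underline{(c, c' :: \pi)} := \underline{((\underline{c}\,\underline{c'}, \epsilon), \pi)}$ 

[Transition table not recoverable; surviving side condition of $\rightsquigarrow_{var}$: $e(x) = (\bar t, e')$.] 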


Figure 4: Krivine Abstract Machine (KAM). 


Krivine Abstract Machine. The machine is in Fig. 4. It relies on the mutually inductively defined 
concepts of local environment, that is a list of closures, and closure, that is a pair of a code and a 
local environment. A state is a pair of a closure and a stack, but in the description of the transitions we 
write it as a triple, by spelling out the two components of the closure. Let us explain the name closure: 
usually, machines are executed on closed terms, and then a closure decodes indeed to a closed term. 
While essential in the study of call-by-value or call-by-need strategies, for the weak head (call-by-name) 
strategy the closed hypothesis is unnecessary, which is why we do not deal with it; so a closure here does 
not necessarily decode to a closed term. Two remarks: 


1. Garbage Collection: transition $\rightsquigarrow_{var}$, beyond implementing micro-substitution, also 
accounts for some garbage collection, as it throws away the local environment $e$ associated to the 
replaced variable $x$. The MAM simply ignores garbage collection. For time analyses garbage collection 
can indeed be safely ignored, while it is clearly essential for space (both the KAM and the MAM are 
however desperately inefficient with respect to space). 


2. No $\alpha$-Renaming and the Length of Local Environments: names are never renamed. The initial 
code, as usual, is assumed to be well-named. Then the entries of a same local environment are all 
on distinct names (formally, a name invariant holds). Then the length of a local environment 
$e$ is bound by the number of names in the initial term, that is, by the size of the initial term 
(formally, $|e| \le |t_0|$). This essential quantitative invariant is used in the analysis of the next paragraph. 
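
The behaviour described above can be tried out with a small Python sketch of a KAM with names. It is written under the assumption that the transitions are the usual ones for this machine (search pushes the argument paired with the current environment, $\beta$ binds the variable to the closure on top of the stack, and the variable transition jumps to the closure bound to the variable); the tuple encoding of terms and the function names `kam` and `lookup` are hypothetical choices made for this illustration.

```python
def lookup(env, name):
    """Scan a local environment (a linked list) for the closure bound to `name`."""
    while env is not None:
        bound_name, closure, rest = env
        if bound_name == name:
            return closure
        env = rest
    return None


def kam(term, max_steps=10_000):
    """Run the machine on `term`; return the last state and transition counts.

    Terms are tuples: ("var", x), ("lam", x, body), ("app", left, right).
    A closure is a pair (code, env); an environment is None (empty) or a
    cell (name, closure, rest). The stack is a Python list, top at the end.
    """
    code, env, stack = term, None, []
    counts = {"@l": 0, "beta": 0, "var": 0}
    for _ in range(max_steps):
        tag = code[0]
        if tag == "app":
            # Search: go left, push the argument closed with the current env.
            stack.append((code[2], env))
            code = code[1]
            counts["@l"] += 1
        elif tag == "lam" and stack:
            # Beta: bind the variable to the closure on top of the stack.
            env = (code[1], stack.pop(), env)
            code = code[2]
            counts["beta"] += 1
        elif tag == "var":
            closure = lookup(env, code[1])
            if closure is None:
                break  # free variable: final state
            # Drop the current environment and adopt the closure's one.
            code, env = closure
            counts["var"] += 1
        else:
            break  # abstraction with empty stack: final state
    return (code, env, stack), counts
```

No renaming ever happens: the same name may be bound to different closures in different local environments, and dropping the current environment in the variable case is the garbage collection mentioned in the first remark.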


Implementation and Complexity Analysis. The proof that the KAM implements the weak head 
strategy follows the recipe for these proofs and it is omitted. For the complexity analysis, the bound of 
the number of transitions can be shown to be exactly as for the MAM. A direct proof is not so simple, 
because the bound on $\rightsquigarrow_{var}$ transitions cannot exploit the size of the global 
environment. The bound can be obtained by relating the KAM with the Searching AM (for which the exact 
same bound of the MAM holds), or by considering the depth (i.e. maximum nesting) of local environments. 
The proof is omitted. 

The interesting part of the analysis is rather the study of the cost of single transitions. As for the 
MAM, we need to spell out the hypotheses on how the KAM is concretely implemented on RAM. 
Variables cannot be implemented with pointers, because the same variable name can be associated to 
different codes in different local environments. So they have to simply be numbers. Then there are two 
choices for the representation of environments, either they are represented as lists or as arrays. In both 
cases $\rightsquigarrow_\beta$ can be implemented in constant time. For the other transitions: 


1. Environments as Arrays: we mentioned that there is a bound on the length of local environments 
($|e| \le |t_0|$) so that arrays can be used. This choice allows implementing $\rightsquigarrow_{var}$ in 
constant time, because $e$ can be accessed directly at the position described by the number given by $x$. 
Transition $\rightsquigarrow_{@l}$ however requires duplicating $e$, and this is necessary because the two 
copies might later on be modified differently. So the cost of a $\rightsquigarrow_{@l}$ transition becomes 
linear in $|t_0|$ and $\rightsquigarrow_{@l}$ transitions all together cost 
$O(|t_0|^2 \cdot (|\rho|_\beta^2 + 1))$, that also becomes the complexity of the whole overhead of 
the KAM. This is worse than the MAM. 


2. Environments as Lists: implementing local environments as lists provides sharing of environments, 
overcoming the problems of arrays. With lists, transition $\rightsquigarrow_{@l}$ becomes constant time, 
as for the MAM, because the copy of $e$ now is simply the copy of a pointer. The trick is that the two 
copies of the environment can only be extended differently on the head, so that the tail of the list can 
be shared. Transition $\rightsquigarrow_{var}$ however now needs to access $e$ sequentially, and so it 
costs $|t_0|$ as for the MAM. Thus globally we obtain the same overhead of the MAM. 
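
The difference between the two representations can be seen in a few lines of Python. This is only an illustration: the helper `extend`, the string placeholders used as closures, and the array length 4 are assumptions, and Python tuples and lists stand in for linked cells and RAM arrays.

```python
def extend(env, name, closure):
    """Extend a list environment on the head; the tail `env` is shared, not copied."""
    return (name, closure, env)


# Lists: duplicating an environment is copying one pointer.
base = extend(extend(None, "x", "cx"), "y", "cy")
left = extend(base, "z", "cz")   # one copy, extended one way
right = extend(base, "w", "cw")  # the other copy, extended differently

# Arrays indexed by variable number: each duplicate is a fresh copy,
# whose cost is linear in the length of the array.
array_env = ["cx", "cy", None, None]
array_copy = list(array_env)
array_copy[2] = "cz"             # the original array is left untouched
```

Both `left` and `right` point to the very same `base` cells, which is the sharing of tails described above; the array version keeps constant-time access by position but pays for it at every duplication.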

Summing up, names can be pushed at the meta-level (as in the MAM) without affecting the complexity 
of the overhead. Thus, names are even less relevant than search at the level of complexity. The moral 
of this tutorial then is that substitution is the crucial aspect for the complexity of abstract machines. 